Dataset statistics
| Number of variables | 18 |
|---|---|
| Number of observations | 153116 |
| Missing cells | 0 |
| Missing cells (%) | 0.0% |
| Duplicate rows | 0 |
| Duplicate rows (%) | 0.0% |
| Total size in memory | 133.3 MiB |
| Average record size in memory | 913.1 B |
Variable types
| CAT | 10 |
|---|---|
| NUM | 8 |
Reproduction
| Analysis started | 2020-03-04 09:27:18.943790 |
|---|---|
| Analysis finished | 2020-03-04 09:33:23.714245 |
| Version | pandas-profiling v2.5.0 |
| Command line | pandas_profiling --config_file config.yaml [YOUR_FILE.csv] |
Warnings
| Warning | Type |
|---|---|
| 주민번호 has a high cardinality: 46136 distinct values | High cardinality |
| 가입일자 has a high cardinality: 3284 distinct values | High cardinality |
| 최종불입일자 has a high cardinality: 2989 distinct values | High cardinality |
| 담당자 has a high cardinality: 1453 distinct values | High cardinality |
| 부서 has a high cardinality: 449 distinct values | High cardinality |
| 가입일자 only contains datetime values, but is categorical. Consider applying pd.to_datetime() | Type |
| 최종불입일자 only contains datetime values, but is categorical. Consider applying pd.to_datetime() | Type |
| 해약금액 has 122654 (80.1%) zeros | Zeros |
| 연체횟수 has 73514 (48.0%) zeros | Zeros |
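The two Type warnings suggest parsing the date columns with `pd.to_datetime()`. The following is a minimal sketch of that conversion, assuming the data is held in a pandas DataFrame. The name `df`, the explicit `format` string, and the derived `days_active` column are illustrative assumptions; the two sample rows are copied from the first rows of the dataset.

```python
import pandas as pd

# Two sample rows copied from the report's sample table; `df` is a hypothetical name.
df = pd.DataFrame({
    "가입일자": ["2008-09-08", "2007-09-28"],
    "최종불입일자": ["2016-12-20", "2012-10-04"],
})

# Parse the ISO-style strings so pandas stores real datetimes
# instead of treating the columns as categorical text.
for col in ["가입일자", "최종불입일자"]:
    df[col] = pd.to_datetime(df[col], format="%Y-%m-%d")

# With datetime columns, date arithmetic becomes possible, for example
# the number of days between the two dates (an illustrative derived column).
df["days_active"] = (df["최종불입일자"] - df["가입일자"]).dt.days
```

After conversion, re-running the profile should report both columns as dates rather than as high-cardinality categoricals, although that depends on the profiler version and configuration.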
df_index
Real number (ℝ≥0)
| Distinct count | 153116 |
|---|---|
| Unique (%) | 100.0% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 88538.39248021109 |
|---|---|
| Minimum | 0 |
| Maximum | 173196 |
| Zeros | 1 |
| Zeros (%) | < 0.1% |
| Memory size | 1.2 MiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 8545.75 |
| Q1 | 46955.75 |
| median | 89654.5 |
| Q3 | 130969.25 |
| 95-th percentile | 164510.5 |
| Maximum | 173196 |
| Range | 173196 |
| Interquartile range (IQR) | 84013.5 |
Descriptive statistics
| Standard deviation | 49427.24275 |
|---|---|
| Coefficient of variation (CV) | 0.5582577385 |
| Kurtosis | -1.152838065 |
| Mean | 88538.39248 |
| Mean Absolute Deviation (MAD) | 42528.61942 |
| Skewness | -0.05564599879 |
| Sum | 1.35566445e+10 |
| Variance | 2443052326 |
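Several of the derived figures in the statistics tables directly above follow from the others by simple arithmetic. As a rough check, the sketch below recomputes the coefficient of variation, the interquartile range, the range, and the variance from the reported mean, standard deviation, quartiles, and extremes. The variable names are illustrative.

```python
import math

# Figures copied from the quantile and descriptive statistics tables above.
mean, std = 88538.39248, 49427.24275
q1, q3 = 46955.75, 130969.25
minimum, maximum = 0, 173196

cv = std / mean                  # coefficient of variation
iqr = q3 - q1                    # interquartile range
value_range = maximum - minimum  # range
variance = std ** 2              # variance is the squared standard deviation

print(f"CV={cv:.10f} IQR={iqr} range={value_range} variance={variance:.0f}")
```

The recomputed values agree with the reported ones up to the rounding of the printed inputs.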
Histogram with fixed size bins (bins=10)
Histogram with variable size bins (bins=[ 0. 11342.5 19638.5 36391.5 53460.5 ... 169139. 169262.5 172048.5 172094. 173196. ], "bayesian blocks" binning strategy used)
| Value | Count | Frequency (%) | |
| 2047 | 1 | < 0.1% | |
| 56625 | 1 | < 0.1% | |
| 9518 | 1 | < 0.1% | |
| 15661 | 1 | < 0.1% | |
| 13612 | 1 | < 0.1% | |
| 3371 | 1 | < 0.1% | |
| 1322 | 1 | < 0.1% | |
| 7465 | 1 | < 0.1% | |
| 5416 | 1 | < 0.1% | |
| 25894 | 1 | < 0.1% | |
| Other values (153106) | 153106 | > 99.9% |
| Value | Count | Frequency (%) | |
| 0 | 1 | < 0.1% | |
| 1 | 1 | < 0.1% | |
| 2 | 1 | < 0.1% | |
| 3 | 1 | < 0.1% | |
| 4 | 1 | < 0.1% |
| Value | Count | Frequency (%) | |
| 173196 | 1 | < 0.1% | |
| 173195 | 1 | < 0.1% | |
| 173194 | 1 | < 0.1% | |
| 173193 | 1 | < 0.1% | |
| 173191 | 1 | < 0.1% |
주민번호
Categorical
| Distinct count | 46136 |
|---|---|
| Unique (%) | 30.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 1.2 MiB |
| Value | Count | Frequency (%) | |
| 110111-0000000 | 70 | < 0.1% | |
| 610315-2000000 | 28 | < 0.1% | |
| 610310-2000000 | 26 | < 0.1% | |
| 641220-2000000 | 26 | < 0.1% | |
| 600315-2000000 | 25 | < 0.1% | |
| 620215-2000000 | 25 | < 0.1% | |
| 470529-1000000 | 24 | < 0.1% | |
| 660305-2000000 | 24 | < 0.1% | |
| 430210-0000000 | 24 | < 0.1% | |
| 600815-2000000 | 23 | < 0.1% | |
| Other values (46126) | 152821 | 99.8% |
Length
| Max length | 14 |
|---|---|
| Mean length | 13.99935996 |
| Min length | 12 |
| Value | Count | Frequency (%) | |
| Decimal_Number | 10 | 90.9% | |
| Dash_Punctuation | 1 | 9.1% |
| Value | Count | Frequency (%) | |
| Common | 11 | 100.0% |
| Value | Count | Frequency (%) | |
| ASCII | 11 | 100.0% |
주소
Categorical
| Distinct count | 33 |
|---|---|
| Unique (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 1.2 MiB |
| Value | Count | Frequency (%) | |
| 경기 | 39708 | 25.9% | |
| 서울 | 39669 | 25.9% | |
| 인천 | 11482 | 7.5% | |
| 경상 | 8978 | 5.9% | |
| 광주 | 7264 | 4.7% | |
| 부산 | 6756 | 4.4% | |
| 강원 | 6431 | 4.2% | |
| 전라 | 5606 | 3.7% | |
| 충청 | 5588 | 3.6% | |
| 대구 | 4660 | 3.0% | |
| Other values (23) | 16974 | 11.1% |
Length
| Max length | 2 |
|---|---|
| Mean length | 1.999954283 |
| Min length | 1 |
| Value | Count | Frequency (%) | |
| Other_Letter | 41 | 97.6% | |
| Space_Separator | 1 | 2.4% |
| Value | Count | Frequency (%) | |
| Hangul | 41 | 97.6% | |
| Common | 1 | 2.4% |
| Value | Count | Frequency (%) | |
| Hangul | 41 | 97.6% | |
| ASCII | 1 | 2.4% |
상태
Categorical
| Distinct count | 5 |
|---|---|
| Unique (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 1.2 MiB |
| Value | Count | Frequency (%) | |
| 해약 | 68695 | 44.9% | |
| 가입 | 62701 | 40.9% | |
| 만기 | 10895 | 7.1% | |
| 행사 | 8162 | 5.3% | |
| 만기_해약 | 2663 | 1.7% |
Length
| Max length | 5 |
|---|---|
| Mean length | 2.052176128 |
| Min length | 2 |
| Value | Count | Frequency (%) | |
| Other_Letter | 8 | 88.9% | |
| Connector_Punctuation | 1 | 11.1% |
| Value | Count | Frequency (%) | |
| Hangul | 8 | 88.9% | |
| Common | 1 | 11.1% |
| Value | Count | Frequency (%) | |
| Hangul | 8 | 88.9% | |
| ASCII | 1 | 11.1% |
가입일자
Categorical
| Distinct count | 3284 |
|---|---|
| Unique (%) | 2.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 1.2 MiB |
| Value | Count | Frequency (%) | |
| 2014-03-03 | 642 | 0.4% | |
| 2014-07-28 | 577 | 0.4% | |
| 2014-08-25 | 562 | 0.4% | |
| 2014-07-07 | 532 | 0.3% | |
| 2014-06-02 | 526 | 0.3% | |
| 2016-07-25 | 514 | 0.3% | |
| 2016-07-11 | 484 | 0.3% | |
| 2014-10-27 | 477 | 0.3% | |
| 2014-12-22 | 456 | 0.3% | |
| 2014-09-22 | 443 | 0.3% | |
| Other values (3274) | 147903 | 96.6% |
Length
| Max length | 10 |
|---|---|
| Mean length | 10 |
| Min length | 10 |
| Value | Count | Frequency (%) | |
| Decimal_Number | 10 | 90.9% | |
| Dash_Punctuation | 1 | 9.1% |
| Value | Count | Frequency (%) | |
| Common | 11 | 100.0% |
| Value | Count | Frequency (%) | |
| ASCII | 11 | 100.0% |
최종불입일자
Categorical
| Distinct count | 2989 |
|---|---|
| Unique (%) | 2.0% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 1.2 MiB |
| Value | Count | Frequency (%) | |
| 2017-12-26 | 25541 | 16.7% | |
| 2017-12-20 | 8216 | 5.4% | |
| 2017-12-15 | 7343 | 4.8% | |
| 2017-12-11 | 6796 | 4.4% | |
| 2017-12-05 | 4818 | 3.1% | |
| 2017-12-28 | 1230 | 0.8% | |
| 2017-11-27 | 963 | 0.6% | |
| 2017-03-27 | 689 | 0.4% | |
| 2017-01-25 | 660 | 0.4% | |
| 2017-07-25 | 606 | 0.4% | |
| Other values (2979) | 96254 | 62.9% |
Length
| Max length | 10 |
|---|---|
| Mean length | 10 |
| Min length | 10 |
| Value | Count | Frequency (%) | |
| Decimal_Number | 10 | 90.9% | |
| Dash_Punctuation | 1 | 9.1% |
| Value | Count | Frequency (%) | |
| Common | 11 | 100.0% |
| Value | Count | Frequency (%) | |
| ASCII | 11 | 100.0% |
총납입회차
Real number (ℝ≥0)
| Distinct count | 22 |
|---|---|
| Unique (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 230.86704851223908 |
|---|---|
| Minimum | 1 |
| Maximum | 390 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 1.2 MiB |
Quantile statistics
| Minimum | 1 |
|---|---|
| 5-th percentile | 100 |
| Q1 | 100 |
| median | 140 |
| Q3 | 360 |
| 95-th percentile | 390 |
| Maximum | 390 |
| Range | 389 |
| Interquartile range (IQR) | 260 |
Descriptive statistics
| Standard deviation | 129.0717396 |
|---|---|
| Coefficient of variation (CV) | 0.5590738931 |
| Kurtosis | -1.84138557 |
| Mean | 230.8670485 |
| Mean Absolute Deviation (MAD) | 125.1568329 |
| Skewness | 0.1519264969 |
| Sum | 35349439 |
| Variance | 16659.51396 |
Histogram with fixed size bins (bins=10)
Histogram with variable size bins (bins=[ 1. 43.5 55. 62. 87.5 ... 262. 282. 330. 375. 390. ], "bayesian blocks" binning strategy used)
| Value | Count | Frequency (%) | |
| 100 | 47891 | 31.3% | |
| 360 | 32925 | 21.5% | |
| 390 | 32042 | 20.9% | |
| 140 | 15083 | 9.9% | |
| 260 | 7509 | 4.9% | |
| 130 | 6731 | 4.4% | |
| 120 | 4961 | 3.2% | |
| 60 | 3451 | 2.3% | |
| 160 | 1273 | 0.8% | |
| 195 | 597 | 0.4% | |
| Other values (12) | 653 | 0.4% |
| Value | Count | Frequency (%) | |
| 1 | 2 | < 0.1% | |
| 39 | 1 | < 0.1% | |
| 48 | 7 | < 0.1% | |
| 50 | 4 | < 0.1% | |
| 60 | 3451 | 2.3% |
| Value | Count | Frequency (%) | |
| 390 | 32042 | 20.9% | |
| 360 | 32925 | 21.5% | |
| 300 | 47 | < 0.1% | |
| 264 | 217 | 0.1% | |
| 260 | 7509 | 4.9% |
최종불입회차
Real number (ℝ≥0)
| Distinct count | 136 |
|---|---|
| Unique (%) | 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 36.52613704642232 |
|---|---|
| Minimum | 1 |
| Maximum | 488 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 1.2 MiB |
Quantile statistics
| Minimum | 1 |
|---|---|
| 5-th percentile | 1 |
| Q1 | 4 |
| median | 26 |
| Q3 | 51 |
| 95-th percentile | 100 |
| Maximum | 488 |
| Range | 487 |
| Interquartile range (IQR) | 47 |
Descriptive statistics
| Standard deviation | 40.17973631 |
|---|---|
| Coefficient of variation (CV) | 1.100026982 |
| Kurtosis | 15.35641539 |
| Mean | 36.52613705 |
| Mean Absolute Deviation (MAD) | 30.11413526 |
| Skewness | 2.560806969 |
| Sum | 5592736 |
| Variance | 1614.41121 |
Histogram with fixed size bins (bins=10)
Histogram with variable size bins (bins=[ 1. 1.5 2.5 3.5 4.5 ... 280. 330. 375. 439. 488. ], "bayesian blocks" binning strategy used)
| Value | Count | Frequency (%) | |
| 1 | 21268 | 13.9% | |
| 100 | 17944 | 11.7% | |
| 2 | 8001 | 5.2% | |
| 3 | 6334 | 4.1% | |
| 4 | 4991 | 3.3% | |
| 5 | 3218 | 2.1% | |
| 42 | 2866 | 1.9% | |
| 6 | 2539 | 1.7% | |
| 7 | 2400 | 1.6% | |
| 8 | 2324 | 1.5% | |
| Other values (126) | 81231 | 53.1% |
| Value | Count | Frequency (%) | |
| 1 | 21268 | 13.9% | |
| 2 | 8001 | 5.2% | |
| 3 | 6334 | 4.1% | |
| 4 | 4991 | 3.3% | |
| 5 | 3218 | 2.1% |
| Value | Count | Frequency (%) | |
| 488 | 1 | < 0.1% | |
| 390 | 158 | 0.1% | |
| 360 | 378 | 0.2% | |
| 300 | 1 | < 0.1% | |
| 260 | 44 | < 0.1% |
은행
Categorical
| Distinct count | 42 |
|---|---|
| Unique (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 1.2 MiB |
| Value | Count | Frequency (%) | |
| 국민 | 32528 | 21.2% | |
| 농협단위조합 | 30221 | 19.7% | |
| 농협중앙회 | 20183 | 13.2% | |
| 신한 | 15732 | 10.3% | |
| 우리 | 14496 | 9.5% | |
| 기업 | 6321 | 4.1% | |
| 하나 | 6287 | 4.1% | |
| 우체국 | 5514 | 3.6% | |
| 새마을금고연합회 | 4217 | 2.8% | |
| 대구 | 2702 | 1.8% | |
| Other values (32) | 14915 | 9.7% |
Length
| Max length | 12 |
|---|---|
| Mean length | 3.665919956 |
| Min length | 2 |
| Value | Count | Frequency (%) | |
| Other_Letter | 82 | 92.1% | |
| Uppercase_Letter | 6 | 6.7% | |
| Space_Separator | 1 | 1.1% |
| Value | Count | Frequency (%) | |
| Hangul | 82 | 92.1% | |
| Latin | 6 | 6.7% | |
| Common | 1 | 1.1% |
| Value | Count | Frequency (%) | |
| Hangul | 82 | 92.1% | |
| ASCII | 7 | 7.9% |
상품금액
Real number (ℝ≥0)
| Distinct count | 22 |
|---|---|
| Unique (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 3178320.776404817 |
|---|---|
| Minimum | 690000 |
| Maximum | 5364000 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 1.2 MiB |
Quantile statistics
| Minimum | 690000 |
|---|---|
| 5-th percentile | 1980000 |
| Q1 | 2400000 |
| median | 3600000 |
| Q3 | 3900000 |
| 95-th percentile | 3900000 |
| Maximum | 5364000 |
| Range | 4674000 |
| Interquartile range (IQR) | 1500000 |
Descriptive statistics
| Standard deviation | 693113.3697 |
|---|---|
| Coefficient of variation (CV) | 0.2180753355 |
| Kurtosis | -1.387084503 |
| Mean | 3178320.776 |
| Mean Absolute Deviation (MAD) | 647152.2876 |
| Skewness | -0.3402970729 |
| Sum | 4.86651764e+11 |
| Variance | 4.804061433e+11 |
Histogram with fixed size bins (bins=10)
Histogram with variable size bins (bins=[ 690000. 1890000. 1990000. 2140000. 2340000. ... 4758000. 4956000. 5156000. 5282000. 5364000.], "bayesian blocks" binning strategy used)
| Value | Count | Frequency (%) | |
| 3900000 | 47474 | 31.0% | |
| 3600000 | 34835 | 22.8% | |
| 2400000 | 31615 | 20.6% | |
| 2800000 | 21757 | 14.2% | |
| 1980000 | 13098 | 8.6% | |
| 3000000 | 2309 | 1.5% | |
| 3300000 | 810 | 0.5% | |
| 2640000 | 268 | 0.2% | |
| 1800000 | 224 | 0.1% | |
| 2975000 | 182 | 0.1% | |
| Other values (12) | 544 | 0.4% |
| Value | Count | Frequency (%) | |
| 690000 | 2 | < 0.1% | |
| 1800000 | 224 | 0.1% | |
| 1980000 | 13098 | 8.6% | |
| 2000000 | 50 | < 0.1% | |
| 2280000 | 8 | < 0.1% |
| Value | Count | Frequency (%) | |
| 5364000 | 48 | < 0.1% | |
| 5200000 | 1 | < 0.1% | |
| 5112000 | 78 | 0.1% | |
| 4800000 | 26 | < 0.1% | |
| 4716000 | 56 | < 0.1% |
총불입액
Real number (ℝ≥0)
| Distinct count | 873 |
|---|---|
| Unique (%) | 0.6% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 712122.1888960004 |
|---|---|
| Minimum | 750 |
| Maximum | 4880000 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 1.2 MiB |
Quantile statistics
| Minimum | 750 |
|---|---|
| 5-th percentile | 10000 |
| Q1 | 60000 |
| median | 350000 |
| Q3 | 1148000 |
| 95-th percentile | 2400000 |
| Maximum | 4880000 |
| Range | 4879250 |
| Interquartile range (IQR) | 1088000 |
Descriptive statistics
| Standard deviation | 861685.3838 |
|---|---|
| Coefficient of variation (CV) | 1.210024624 |
| Kurtosis | 0.3129419121 |
| Mean | 712122.1889 |
| Mean Absolute Deviation (MAD) | 708763.915 |
| Skewness | 1.25307604 |
| Sum | 1.090373011e+11 |
| Variance | 7.425017006e+11 |
Histogram with fixed size bins (bins=10)
Histogram with variable size bins (bins=[7.500e+02 8.750e+02 1.250e+03 3.375e+03 3.875e+03 ... 3.342e+06 3.528e+06 3.750e+06 3.990e+06 4.880e+06], "bayesian blocks" binning strategy used)
| Value | Count | Frequency (%) | |
| 2400000 | 12722 | 8.3% | |
| 10000 | 11793 | 7.7% | |
| 1980000 | 6129 | 4.0% | |
| 20000 | 5481 | 3.6% | |
| 30000 | 4534 | 3.0% | |
| 60000 | 3761 | 2.5% | |
| 120000 | 3101 | 2.0% | |
| 40000 | 2804 | 1.8% | |
| 420000 | 2599 | 1.7% | |
| 24000 | 2484 | 1.6% | |
| Other values (863) | 97708 | 63.8% |
| Value | Count | Frequency (%) | |
| 750 | 137 | 0.1% | |
| 1000 | 49 | < 0.1% | |
| 1500 | 6 | < 0.1% | |
| 2000 | 7 | < 0.1% | |
| 2250 | 8 | < 0.1% |
| Value | Count | Frequency (%) | |
| 4880000 | 1 | < 0.1% | |
| 4848000 | 1 | < 0.1% | |
| 4800000 | 3 | < 0.1% | |
| 4640000 | 1 | < 0.1% | |
| 4500000 | 1 | < 0.1% |
해약금액
Real number (ℝ≥0)
| Distinct count | 1661 |
|---|---|
| Unique (%) | 1.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 115545.8513284046 |
|---|---|
| Minimum | 0 |
| Maximum | 3876000 |
| Zeros | 122654 |
| Zeros (%) | 80.1% |
| Memory size | 1.2 MiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 0 |
| median | 0 |
| Q3 | 0 |
| 95-th percentile | 1032000 |
| Maximum | 3876000 |
| Range | 3876000 |
| Interquartile range (IQR) | 0 |
Descriptive statistics
| Standard deviation | 375223.8071 |
|---|---|
| Coefficient of variation (CV) | 3.247401813 |
| Kurtosis | 13.58022806 |
| Mean | 115545.8513 |
| Mean Absolute Deviation (MAD) | 200814.4248 |
| Skewness | 3.695094923 |
| Sum | 1.769191857e+10 |
| Variance | 1.407929054e+11 |
Histogram with fixed size bins (bins=10)
Histogram with variable size bins (bins=[0.000000e+00 3.150000e+02 6.900000e+02 8.000000e+02 1.135000e+03 ... 2.628225e+06 2.628725e+06 2.650710e+06 2.736500e+06 3.876000e+06], "bayesian blocks" binning strategy used)
| Value | Count | Frequency (%) | |
| 0 | 122654 | 80.1% | |
| 10000 | 6748 | 4.4% | |
| 20000 | 1373 | 0.9% | |
| 1944000 | 1235 | 0.8% | |
| 24000 | 1097 | 0.7% | |
| 1603000 | 930 | 0.6% | |
| 15000 | 744 | 0.5% | |
| 28000 | 481 | 0.3% | |
| 30000 | 150 | 0.1% | |
| 750 | 120 | 0.1% | |
| Other values (1651) | 17584 | 11.5% |
| Value | Count | Frequency (%) | |
| 0 | 122654 | 80.1% | |
| 630 | 17 | < 0.1% | |
| 750 | 120 | 0.1% | |
| 850 | 16 | < 0.1% | |
| 1000 | 33 | < 0.1% |
| Value | Count | Frequency (%) | |
| 3876000 | 2 | < 0.1% | |
| 3664000 | 1 | < 0.1% | |
| 3600000 | 1 | < 0.1% | |
| 3163310 | 1 | < 0.1% | |
| 3000000 | 2 | < 0.1% |
납입방법
Categorical
| Distinct count | 4 |
|---|---|
| Unique (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 1.2 MiB |
| Value | Count | Frequency (%) | |
| CMS | 150778 | 98.5% | |
| 카드 | 1770 | 1.2% | |
| 무통장 | 562 | 0.4% | |
| 현금 | 6 | < 0.1% |
Length
| Max length | 3 |
|---|---|
| Mean length | 2.988400951 |
| Min length | 2 |
| Value | Count | Frequency (%) | |
| Other_Letter | 7 | 70.0% | |
| Uppercase_Letter | 3 | 30.0% |
| Value | Count | Frequency (%) | |
| Hangul | 7 | 70.0% | |
| Latin | 3 | 30.0% |
| Value | Count | Frequency (%) | |
| Hangul | 7 | 70.0% | |
| ASCII | 3 | 30.0% |
담당자
Categorical
| Distinct count | 1453 |
|---|---|
| Unique (%) | 0.9% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 1.2 MiB |
| Value | Count | Frequency (%) | |
| 더피플라이프 | 63559 | 41.5% | |
| 금강종합상조(주) | 28681 | 18.7% | |
| 강대석 | 6252 | 4.1% | |
| 김영권 | 4854 | 3.2% | |
| 이덕술 | 4746 | 3.1% | |
| 김영경 | 3930 | 2.6% | |
| 심상열 | 1818 | 1.2% | |
| 고달진 | 1703 | 1.1% | |
| 김일성 | 1382 | 0.9% | |
| 안미나 | 1181 | 0.8% | |
| Other values (1443) | 35010 | 22.9% |
Length
| Max length | 10 |
|---|---|
| Mean length | 5.372110034 |
| Min length | 2 |
| Value | Count | Frequency (%) | |
| Other_Letter | 227 | 98.3% | |
| Uppercase_Letter | 2 | 0.9% | |
| Close_Punctuation | 1 | 0.4% | |
| Open_Punctuation | 1 | 0.4% |
| Value | Count | Frequency (%) | |
| Hangul | 227 | 98.3% | |
| Common | 2 | 0.9% | |
| Latin | 2 | 0.9% |
| Value | Count | Frequency (%) | |
| Hangul | 227 | 98.3% | |
| ASCII | 4 | 1.7% |
부서
Categorical
| Distinct count | 449 |
|---|---|
| Unique (%) | 0.3% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 1.2 MiB |
| Value | Count | Frequency (%) | |
| 김선 | 20524 | 13.4% | |
| 차용갑 | 20441 | 13.4% | |
| 관리부 | 19657 | 12.8% | |
| 직영팀 | 15346 | 10.0% | |
| 특판영업 | 10421 | 6.8% | |
| 강대석 | 7831 | 5.1% | |
| 이진희 | 6187 | 4.0% | |
| 이덕술 | 6070 | 4.0% | |
| CJ오쇼핑 | 5549 | 3.6% | |
| 이길숙 | 4422 | 2.9% | |
| Other values (439) | 36668 | 23.9% |
Length
| Max length | 12 |
|---|---|
| Mean length | 3.246492855 |
| Min length | 2 |
| Value | Count | Frequency (%) | |
| Other_Letter | 237 | 94.0% | |
| Uppercase_Letter | 8 | 3.2% | |
| Decimal_Number | 4 | 1.6% | |
| Close_Punctuation | 1 | 0.4% | |
| Open_Punctuation | 1 | 0.4% | |
| Other_Punctuation | 1 | 0.4% |
| Value | Count | Frequency (%) | |
| Hangul | 237 | 94.0% | |
| Latin | 8 | 3.2% | |
| Common | 7 | 2.8% |
| Value | Count | Frequency (%) | |
| Hangul | 237 | 94.0% | |
| ASCII | 15 | 6.0% |
연체횟수
Real number (ℝ)
| Distinct count | 318 |
|---|---|
| Unique (%) | 0.2% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 20.484456229264087 |
|---|---|
| Minimum | -459 |
| Maximum | 119 |
| Zeros | 73514 |
| Zeros (%) | 48.0% |
| Memory size | 1.2 MiB |
Quantile statistics
| Minimum | -459 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 0 |
| median | 1 |
| Q3 | 39 |
| 95-th percentile | 95 |
| Maximum | 119 |
| Range | 578 |
| Interquartile range (IQR) | 39 |
Descriptive statistics
| Standard deviation | 38.2442983 |
|---|---|
| Coefficient of variation (CV) | 1.866991141 |
| Kurtosis | 24.27641614 |
| Mean | 20.48445623 |
| Mean Absolute Deviation (MAD) | 26.84935916 |
| Skewness | -2.094260032 |
| Sum | 3136498 |
| Variance | 1462.626353 |
Histogram with fixed size bins (bins=10)
Histogram with variable size bins (bins=[-459. -383. -373.5 -354.5 -351.5 ... 98.5 99.5 116.5 118.5 119. ], "bayesian blocks" binning strategy used)
| Value | Count | Frequency (%) | |
| 0 | 73514 | 48.0% | |
| 1 | 3196 | 2.1% | |
| 99 | 3194 | 2.1% | |
| 45 | 1754 | 1.1% | |
| 38 | 1494 | 1.0% | |
| 41 | 1435 | 0.9% | |
| 37 | 1411 | 0.9% | |
| 2 | 1377 | 0.9% | |
| 98 | 1293 | 0.8% | |
| 46 | 1264 | 0.8% | |
| Other values (308) | 63184 | 41.3% |
| Value | Count | Frequency (%) | |
| -459 | 1 | < 0.1% | |
| -386 | 1 | < 0.1% | |
| -380 | 2 | < 0.1% | |
| -379 | 2 | < 0.1% | |
| -378 | 1 | < 0.1% |
| Value | Count | Frequency (%) | |
| 119 | 111 | 0.1% | |
| 118 | 63 | < 0.1% | |
| 117 | 59 | < 0.1% | |
| 116 | 36 | < 0.1% | |
| 115 | 37 | < 0.1% |
성별
Categorical
| Distinct count | 2 |
|---|---|
| Unique (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 1.2 MiB |
| Value | Count | Frequency (%) | |
| 여 | 80553 | 52.6% | |
| 남 | 72563 | 47.4% |
Length
| Max length | 1 |
|---|---|
| Mean length | 1 |
| Min length | 1 |
| Value | Count | Frequency (%) | |
| Other_Letter | 2 | 100.0% |
| Value | Count | Frequency (%) | |
| Hangul | 2 | 100.0% |
| Value | Count | Frequency (%) | |
| Hangul | 2 | 100.0% |
나이
Real number (ℝ≥0)
| Distinct count | 89 |
|---|---|
| Unique (%) | 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 57.952415162360566 |
|---|---|
| Minimum | 21 |
| Maximum | 120 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 1.2 MiB |
Quantile statistics
| Minimum | 21 |
|---|---|
| 5-th percentile | 35 |
| Q1 | 49 |
| median | 58 |
| Q3 | 68 |
| 95-th percentile | 80 |
| Maximum | 120 |
| Range | 99 |
| Interquartile range (IQR) | 19 |
Descriptive statistics
| Standard deviation | 13.49965084 |
|---|---|
| Coefficient of variation (CV) | 0.2329437143 |
| Kurtosis | -0.4155432677 |
| Mean | 57.95241516 |
| Mean Absolute Deviation (MAD) | 10.95249507 |
| Skewness | -0.03494343087 |
| Sum | 8873442 |
| Variance | 182.2405728 |
Histogram with fixed size bins (bins=10)
Histogram with variable size bins (bins=[ 21. 22.5 24.5 25.5 26.5 ... 105. 108.5 109.5 119.5 120. ], "bayesian blocks" binning strategy used)
| Value | Count | Frequency (%) | |
| 60 | 4880 | 3.2% | |
| 59 | 4754 | 3.1% | |
| 61 | 4521 | 3.0% | |
| 58 | 4505 | 2.9% | |
| 62 | 4315 | 2.8% | |
| 56 | 4298 | 2.8% | |
| 55 | 4253 | 2.8% | |
| 52 | 4181 | 2.7% | |
| 63 | 4158 | 2.7% | |
| 65 | 4050 | 2.6% | |
| Other values (79) | 109201 | 71.3% |
| Value | Count | Frequency (%) | |
| 21 | 2 | < 0.1% | |
| 22 | 16 | < 0.1% | |
| 23 | 39 | < 0.1% | |
| 24 | 32 | < 0.1% | |
| 25 | 81 | 0.1% |
| Value | Count | Frequency (%) | |
| 120 | 7 | < 0.1% | |
| 119 | 3 | < 0.1% | |
| 118 | 1 | < 0.1% | |
| 117 | 1 | < 0.1% | |
| 114 | 2 | < 0.1% |
Pearson's r
Pearson's correlation coefficient (r) measures the linear correlation between two variables. Its value lies between -1 and +1: -1 indicates total negative linear correlation, 0 indicates no linear correlation, and +1 indicates total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, so for a linear relationship the steepness of the line (its angle to the x-axis) does not affect r. To calculate r for two variables X and Y, divide the covariance of X and Y by the product of their standard deviations.
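The calculation described above can be sketched in plain Python. This is an illustrative implementation, not the profiler's own code; the function name `pearson_r` and the sample data are assumptions.

```python
import math

def pearson_r(xs, ys):
    """Covariance of xs and ys divided by the product of their standard deviations."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Population (divide-by-n) forms are used throughout; the 1/n factors
    # cancel in the ratio, so sample forms would give the same r.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / n
    std_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs) / n)
    std_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys) / n)
    return cov / (std_x * std_y)

xs = [1.0, 2.0, 3.0, 4.0, 5.0]
steep = [10 * x + 3 for x in xs]     # steep rising line
shallow = [0.1 * x - 7 for x in xs]  # shallow rising line, shifted
falling = [-2 * x for x in xs]       # falling line

print(pearson_r(xs, steep), pearson_r(xs, shallow), pearson_r(xs, falling))
```

Both rising lines give r close to +1 regardless of slope or offset, which illustrates the location and scale invariance. The falling line gives r close to -1.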
Spearman's ρ
Spearman's rank correlation coefficient (ρ) measures the monotonic correlation between two variables and is therefore better than Pearson's r at catching nonlinear monotonic relationships. Its value lies between -1 and +1: -1 indicates total negative monotonic correlation, 0 indicates no monotonic correlation, and +1 indicates total positive monotonic correlation. To calculate ρ for two variables X and Y, divide the covariance of the rank variables of X and Y by the product of their standard deviations.
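A minimal sketch of this rank-based calculation follows, again in plain Python. The helper names `average_ranks`, `pearson_r`, and `spearman_rho` are illustrative, and tied values are assumed to receive the average of their ranks.

```python
import math

def average_ranks(values):
    """Return 1-based ranks, giving tied values the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the group of equal values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def pearson_r(xs, ys):
    """Covariance divided by the product of the standard deviations."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

def spearman_rho(xs, ys):
    """Pearson's r applied to the rank variables of xs and ys."""
    return pearson_r(average_ranks(xs), average_ranks(ys))

xs = [1, 2, 3, 4, 5]
cubes = [x ** 3 for x in xs]  # monotonic but nonlinear

print(spearman_rho(xs, cubes), pearson_r(xs, cubes))
```

For the cubic data, ρ is close to 1 because the relationship is perfectly monotonic, while Pearson's r stays below 1 because it is not linear.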
Kendall's τ
Like Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1: -1 indicates total negative correlation, 0 indicates no correlation, and +1 indicates total positive correlation. To calculate τ for two variables X and Y, count the concordant and discordant pairs of observations: τ is the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
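The pair-counting definition above corresponds to the variant usually called τ-a. The sketch below implements exactly that definition. It is illustrative only: the function name `kendall_tau_a` is an assumption, tied pairs count as neither concordant nor discordant, and statistical libraries often report a tie-adjusted variant (τ-b) that can differ when ties are present.

```python
from itertools import combinations
import math

def kendall_tau_a(xs, ys):
    """(concordant - discordant) / total number of pairs."""
    concordant = 0
    discordant = 0
    pairs = list(combinations(range(len(xs)), 2))
    for i, j in pairs:
        # A pair is concordant when x and y move in the same direction.
        direction = (xs[i] - xs[j]) * (ys[i] - ys[j])
        if direction > 0:
            concordant += 1
        elif direction < 0:
            discordant += 1
    return (concordant - discordant) / len(pairs)

same_order = kendall_tau_a([1, 2, 3, 4], [1, 2, 3, 4])
reversed_order = kendall_tau_a([1, 2, 3, 4], [4, 3, 2, 1])
one_swap = kendall_tau_a([1, 2, 3], [1, 3, 2])

print(same_order, reversed_order, one_swap)
```

In the last case, two of the three pairs are concordant and one is discordant, so τ is (2 - 1) / 3.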
First rows
| | df_index | 주민번호 | 주소 | 상태 | 가입일자 | 최종불입일자 | 총납입회차 | 최종불입회차 | 은행 | 상품금액 | 총불입액 | 해약금액 | 납입방법 | 담당자 | 부서 | 연체횟수 | 성별 | 나이 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 0 | 590318-1000000 | 경기 | 만기 | 2008-09-08 | 2016-12-20 | 100 | 100 | 농협단위조합 | 2400000 | 2400000 | 0 | CMS | 더피플라이프 | 관리부 | 0 | 남 | 61 |
| 1 | 1 | 581125-1000000 | 경기 | 행사 | 2007-09-28 | 2012-10-04 | 100 | 100 | 국민 | 2400000 | 2400000 | 0 | CMS | 더피플라이프 | 관리부 | 0 | 남 | 62 |
| 2 | 2 | 831121-1000000 | 부산 | 가입 | 2007-10-23 | 2013-05-20 | 100 | 68 | 농협중앙회 | 2400000 | 1632000 | 0 | CMS | 더피플라이프 | 관리부 | 32 | 남 | 37 |
| 3 | 3 | 821023-1000000 | 울산 | 행사 | 2007-10-23 | 2014-12-31 | 100 | 100 | 국민 | 2400000 | 2400000 | 0 | CMS | 더피플라이프 | 관리부 | 0 | 남 | 38 |
| 4 | 4 | 340728-2000000 | 경기 | 만기 | 2007-10-31 | 2016-01-05 | 100 | 100 | 우리 | 2400000 | 2400000 | 0 | CMS | 더피플라이프 | 관리부 | 0 | 여 | 86 |
| 5 | 5 | 520206-1000000 | 서울 | 해약 | 2007-11-19 | 2013-08-05 | 100 | 70 | 기업 | 2400000 | 1680000 | 1212000 | CMS | 더피플라이프 | 관리부 | 30 | 남 | 68 |
| 6 | 6 | 760210-1000000 | 서울 | 만기 | 2008-03-31 | 2016-06-20 | 100 | 100 | 국민 | 2400000 | 2400000 | 0 | CMS | 더피플라이프 | 관리부 | 0 | 남 | 44 |
| 7 | 7 | 541016-2000000 | 충남 | 가입 | 2008-04-14 | 2017-12-20 | 100 | 97 | 농협단위조합 | 2400000 | 2328000 | 0 | CMS | 더피플라이프 | 관리부 | 0 | 여 | 66 |
| 8 | 8 | 521001-2000000 | 서울 | 해약 | 2008-09-23 | 2016-08-25 | 100 | 96 | 신한 | 2400000 | 2304000 | 1866000 | CMS | 더피플라이프 | 관리부 | 4 | 여 | 68 |
| 9 | 9 | 530815-2000000 | 서울 | 만기 | 2008-09-23 | 2016-12-30 | 100 | 100 | 농협중앙회 | 2400000 | 2400000 | 0 | CMS | 더피플라이프 | 관리부 | 0 | 여 | 67 |
Last rows
| | df_index | 주민번호 | 주소 | 상태 | 가입일자 | 최종불입일자 | 총납입회차 | 최종불입회차 | 은행 | 상품금액 | 총불입액 | 해약금액 | 납입방법 | 담당자 | 부서 | 연체횟수 | 성별 | 나이 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 153106 | 173184 | 320814-2000000 | 부산 | 해약 | 2008-12-19 | 2013-09-23 | 100 | 58 | 우체국 | 2400000 | 1392000 | 912000 | CMS | 더피플라이프 | 관리부 | 42 | 여 | 88 |
| 153107 | 173185 | 520228-1000000 | 부산 | 해약 | 2008-12-19 | 2015-10-20 | 100 | 83 | 국민 | 2400000 | 1992000 | 1613000 | CMS | 더피플라이프 | 관리부 | 17 | 남 | 68 |
| 153108 | 173186 | 660312-2000000 | 부산 | 해약 | 2008-12-22 | 2013-10-14 | 100 | 54 | 농협중앙회 | 2400000 | 1296000 | 828000 | CMS | 더피플라이프 | 관리부 | 46 | 여 | 54 |
| 153109 | 173189 | 370127-1000000 | 경남 | 해약 | 2008-12-23 | 2010-07-26 | 100 | 20 | 국민 | 2400000 | 480000 | 0 | CMS | 더피플라이프 | 관리부 | 80 | 남 | 83 |
| 153110 | 173190 | 561026-2000000 | 부산 | 해약 | 2009-01-07 | 2009-03-20 | 100 | 3 | 농협단위조합 | 2400000 | 72000 | 0 | CMS | 더피플라이프 | 관리부 | 97 | 여 | 64 |
| 153111 | 173191 | 761012-1000000 | 부산 | 해약 | 2009-01-07 | 2009-04-27 | 100 | 3 | 국민 | 2400000 | 72000 | 0 | CMS | 더피플라이프 | 관리부 | 97 | 남 | 44 |
| 153112 | 173193 | 820807-1000000 | 부산 | 해약 | 2008-12-16 | 2012-06-29 | 60 | 43 | 부산 | 2400000 | 1720000 | 1240000 | CMS | 더피플라이프 | 관리부 | 17 | 남 | 38 |
| 153113 | 173194 | 830402-1000000 | 부산 | 만기 | 2008-12-16 | 2013-11-25 | 60 | 60 | 새마을금고연합회 | 2400000 | 2400000 | 0 | CMS | 더피플라이프 | 관리부 | 0 | 남 | 37 |
| 153114 | 173195 | 490809-2000000 | 부산 | 해약 | 2008-12-16 | 2008-12-18 | 60 | 1 | 부산 | 2400000 | 40000 | 0 | CMS | 더피플라이프 | 관리부 | 59 | 여 | 71 |
| 153115 | 173196 | 540801-2000000 | 부산 | 해약 | 2008-12-16 | 2009-11-25 | 60 | 12 | 국민 | 2400000 | 480000 | 0 | CMS | 더피플라이프 | 관리부 | 48 | 여 | 66 |